The p value is the probability, based on the Student t distribution, that random fluctuations alone
could produce a t value at least as large as the one you just calculated.
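
As a rough illustration of that definition, the following sketch converts a t value and its degrees of freedom into a p value with R's built-in pt() function. The t value and df shown are hypothetical numbers chosen only for demonstration; they don't come from any dataset in this chapter.

```r
# Hypothetical inputs, for illustration only (not taken from the text)
t_value <- 2.31
df <- 18

# Two-sided p value: chance of a t at least this large in either direction
p_two_sided <- 2 * pt(abs(t_value), df = df, lower.tail = FALSE)

# One-sided p value: chance of a t at least this large in the upper tail only
p_one_sided <- pt(t_value, df = df, lower.tail = FALSE)

print(c(two_sided = p_two_sided, one_sided = p_one_sided))
```

For a positive t value, the two-sided p value is simply twice the one-sided p value, because the Student t distribution is symmetric around 0.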
The Student t statistic is always calculated using the general equation D/SE. Each specific type of t test
we discussed earlier — including one-group, paired, unpaired, and Welch — calculates D, SE, and df
slightly differently. These different calculations are summarized in Table 11-1.
TABLE 11-1 How t Tests Calculate Difference, Standard Error, and Degrees of Freedom

|    | One-Group | Paired | Unpaired t (Equal Variance) | Welch t (Unequal Variance) |
|----|-----------|--------|-----------------------------|----------------------------|
| D  | Difference between mean of observations and a hypothesized value (H) | Mean of paired differences | Difference between means of the two groups | Difference between means of the two groups |
| SE | SE of the observations | SE of paired differences | SE of difference, based on a pooled estimate of SD within each group | SE of difference, from SE of each mean, by propagation of errors |
| df | Number of observations – 1 | Number of pairs – 1 | Total number of observations – 2 | “Effective” df, based on the size and SD of the two groups |
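
The calculations summarized in Table 11-1 can be sketched by hand in R. This is a minimal sketch, not the code any package uses internally: the data are simulated, and the names x, x2, y, H, and se() are hypothetical ones introduced only for this illustration. The Welch df line uses the standard Welch-Satterthwaite approximation.

```r
set.seed(1)                                # simulated data, illustration only
x  <- rnorm(12, mean = 105, sd = 10)       # group 1 (also used for one-group test)
y  <- rnorm(15, mean = 98,  sd = 18)       # group 2, independent of x
x2 <- x + rnorm(12, mean = -2, sd = 4)     # second measurement paired with x
H  <- 100                                  # hypothesized value

# Standard error of a mean (hypothetical helper)
se <- function(v) sd(v) / sqrt(length(v))

# One-group: D is the mean minus H; df is observations - 1
one <- c(D = mean(x) - H, SE = se(x), df = length(x) - 1)

# Paired: work with the within-pair differences; df is pairs - 1
d <- x - x2
paired <- c(D = mean(d), SE = se(d), df = length(d) - 1)

# Unpaired, equal variance: SE comes from a pooled SD; df is total - 2
n1 <- length(x); n2 <- length(y)
sp <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
pooled <- c(D = mean(x) - mean(y), SE = sp * sqrt(1 / n1 + 1 / n2),
            df = n1 + n2 - 2)

# Welch: SE by propagation of errors; "effective" df (Welch-Satterthwaite)
v1 <- var(x) / n1; v2 <- var(y) / n2
welch <- c(D = mean(x) - mean(y), SE = sqrt(v1 + v2),
           df = (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1)))

# In every case, t = D / SE
results <- rbind(one, paired, pooled, welch)
results <- cbind(results, t = results[, "D"] / results[, "SE"])
print(results)
```

Each row should agree with the statistic and df reported by R's built-in t.test() for the corresponding test, which is a convenient way to check your understanding of the table.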
Executing a t test
Statistical software packages contain commands that can execute (or run) t tests (see Chapter 4
for more about these packages). The examples presented here use R, and in this section, we
explain the data structure required for running the various t tests in R. For demonstration, we use
data from the 2017–2020 National Health and Nutrition Examination Survey (NHANES) file
(available at wwwn.cdc.gov/nchs/nhanes/continuousnhanes/default.aspx?Cycle=2017-2020).
For the one-group t test, you need the column of data containing the variable whose mean you
want to compare to the hypothesized value (H), and you need to know H. R and other software
enable you to specify a value for H and assume 0 if you don’t specify anything. In the NHANES
data, the fasting glucose variable is LBXGLU, so the R code to test the mean fasting glucose
against a maximum healthy level of 100 mg/dL in an R dataframe named GLUCOSE is
t.test(GLUCOSE$LBXGLU, mu = 100).
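
If you don't have the NHANES file loaded, you can try the same command on a stand-in dataframe. The sketch below assumes a simulated GLUCOSE dataframe (the 50 values are randomly generated, not real NHANES measurements) and shows how to pull the individual pieces out of the object that t.test() returns.

```r
set.seed(42)
# Simulated stand-in for the NHANES data; these are NOT real measurements
GLUCOSE <- data.frame(LBXGLU = rnorm(50, mean = 104, sd = 15))

# One-group t test of mean fasting glucose against H = 100 mg/dL
result <- t.test(GLUCOSE$LBXGLU, mu = 100)

result$statistic   # the t value (D / SE)
result$parameter   # degrees of freedom (number of observations - 1)
result$p.value     # two-sided p value
result$conf.int    # 95% confidence interval around the observed mean
```

Typing result by itself prints all of these together in a formatted report.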
For the paired t test, you need two columns of data representing the pair of numbers you want to
enter into the paired t test. For example, in NHANES, systolic blood pressure (SBP) was
measured in the same participant twice (variables BPXOSY1 and BPXOSY2). To compare these
with a paired t test in an R dataframe named BP, the code is t.test(BP$BPXOSY1, BP$BPXOSY2,
paired = TRUE).
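
The following sketch runs the paired command on a simulated BP dataframe (randomly generated values, not real NHANES readings) and illustrates that a paired t test gives the same t value as a one-group t test on the within-pair differences against a hypothesized value of 0.

```r
set.seed(7)
# Simulated stand-in for the NHANES data; these are NOT real measurements
first_reading <- rnorm(40, mean = 124, sd = 14)
BP <- data.frame(
  BPXOSY1 = first_reading,
  BPXOSY2 = first_reading + rnorm(40, mean = -1.5, sd = 5)
)

# Paired t test on the two SBP readings
paired_result <- t.test(BP$BPXOSY1, BP$BPXOSY2, paired = TRUE)

# Equivalent one-group t test on the differences, with H = 0
diff_result <- t.test(BP$BPXOSY1 - BP$BPXOSY2, mu = 0)

# TRUE if the two approaches yield the same t value
same_t <- isTRUE(all.equal(unname(paired_result$statistic),
                           unname(diff_result$statistic)))
print(same_t)
```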
For the independent t test, you need to have one column coded as the grouping variable
(preferably a two-state flag coded as 0 and 1), and another column with the value you want to
test. We created a two-state flag in the NHANES data called MARRIED where 1 = married and 0
= all other marital statuses. To compare mean fasting glucose level between these two groups in
a dataframe named NHANES, we used this code: t.test(NHANES$LBXGLU ~